perf(gui): Significantly reduce cost of 2d render elements#2514
perf(gui): Significantly reduce cost of 2d render elements#2514githubawn wants to merge 26 commits intoTheSuperHackers:mainfrom
Conversation
|
| Filename | Overview |
|---|---|
| GeneralsMD/Code/GameEngineDevice/Source/W3DDevice/GameClient/W3DDisplay.cpp | Core of the PR: adds setup2DRenderState, onBeginBatch/onEndBatch/onFlush, and wires all draw* methods into the batch. Texture ref management via Get_Texture/Add_Ref/Release_Ref is correct across all batching/non-batching paths. |
| Generals/Code/GameEngineDevice/Source/W3DDevice/GameClient/W3DDisplay.cpp | Mirror of the GeneralsMD changes for the vanilla game build; identical logic and same minor style note applies. |
| GeneralsMD/Code/GameEngine/Source/GameClient/Display.cpp | Adds beginBatch/endBatch/flush base-class implementations with correct re-entrance guard. |
| GeneralsMD/Code/GameEngineDevice/Source/W3DDevice/GameClient/W3DDisplayString.cpp | Moves hotkey text render after TheDisplay->flush() to commit the 2D batch before text rendering; no-op outside batch mode. |
| GeneralsMD/Code/GameEngineDevice/Source/W3DDevice/GameClient/W3DInGameUI.cpp | Wraps the InGameUI draw pass in beginBatch/endBatch; re-entrance guard prevents double-nesting. |
| Core/GameEngineDevice/Source/W3DDevice/GameClient/W3DView.cpp | Wraps iterateDrawablesInRegion with beginBatch/endBatch; clean integration. |
| GeneralsMD/Code/GameEngineDevice/Include/W3DDevice/GameClient/W3DDisplay.h | Adds TextureClass forward declaration and new batch member fields; all initialized in the constructor. |
| GeneralsMD/Code/GameEngine/Include/GameClient/Display.h | Adds public batching API and protected virtual hooks; m_isBatching initialized in Display() constructor. |
Sequence Diagram
sequenceDiagram
participant Caller as W3DView / InGameUI
participant Disp as Display (base)
participant W3D as W3DDisplay
participant R2D as Render2DClass
Caller->>Disp: beginBatch()
Disp->>W3D: onBeginBatch()
W3D->>R2D: Reset()
note over W3D: m_batchNeedsInit=TRUE<br/>m_batchTexture=nullptr
loop Each draw* call
Caller->>W3D: drawImage / drawLine / ...
W3D->>W3D: setup2DRenderState(tex, mode, grayscale)
alt Same state as current batch
W3D-->>W3D: early-return (no flush)
else State changed
W3D->>W3D: onFlush()
W3D->>R2D: Render() + Reset()
W3D->>W3D: swap tex ref, update mode/grayscale
end
W3D->>R2D: Add_Quad / Add_Line / Add_Tri
end
Caller->>Disp: endBatch()
Disp->>W3D: onFlush()
W3D->>R2D: Render() + Reset()
Disp->>W3D: onEndBatch()
W3D->>W3D: REF_PTR_RELEASE(m_batchTexture)
Prompt To Fix All With AI
This is a comment left during a code review.
Path: GeneralsMD/Code/GameEngineDevice/Source/W3DDevice/GameClient/W3DDisplay.cpp
Line: 2739
Comment:
**Long clipping condition on one line**
This is a very long line (~125 chars) that mixes all four boundary checks together. In the early-out block at the top of the function the same four conditions are already split across four indented lines — the inner clipping check would benefit from the same treatment for consistency and debuggability (each sub-condition can have its own breakpoint).
```suggestion
if (screen_rect.Left < m_clipRegion.lo.x ||
screen_rect.Right > m_clipRegion.hi.x ||
screen_rect.Top < m_clipRegion.lo.y ||
screen_rect.Bottom > m_clipRegion.hi.y)
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: Generals/Code/GameEngineDevice/Source/W3DDevice/GameClient/W3DDisplay.cpp
Line: 2630
Comment:
**Long clipping condition on one line (Generals mirror)**
Same long single-line clipping check as in the GeneralsMD copy — worth splitting onto four lines for consistency with the early-out block directly above it.
```suggestion
if (screen_rect.Left < m_clipRegion.lo.x ||
screen_rect.Right > m_clipRegion.hi.x ||
screen_rect.Top < m_clipRegion.lo.y ||
screen_rect.Bottom > m_clipRegion.hi.y)
```
How can I resolve this? If you propose a fix, please make it concise.Reviews (21): Last reviewed commit: "implemented feedback" | Re-trigger Greptile
split single line IF
|
Significantly improves #109 |
optimize clipping check
remove obsolete else
return statement on seperate line
centralized flush
|
"One P1 render-state initialization bug in onBeginBatch() should be fixed before merging to avoid potential cross-frame visual corruption." The P1 finding could be a false positive based on a local variable analysis that omits the downstream hardware synchronization in Render2DClass::Render() (render2d.cpp:601). Render2DClass is a self-synchronizing state machine. Its Render() call (line 628 for textures, line 694 for shaders) explicitly pushes its internal members to the DX8Wrapper on every execution. The tracking variables in W3DDisplay (m_batchTexture, etc.) are mirror-images of Render2DClass's internal members. happy? |
|
No, the P1 finding stands. Your argument about Render2DClass being self-synchronizing addresses hardware state consistency, but not the C++ undefined behavior in W3DDisplay. The issue:
Even if Render2DClass correctly synchronizes hardware state on every
An uninitialized bool means these decisions are based on garbage. The fact that the hardware might recover downstream doesn't make reading uninitialized memory defined behavior. The fix is one line. Keep it. |
endBatch() is re-entrant-safe Clipping fast-path in drawImage remove Redundant m_isBatching = FALSE
…sGameCode into perf/gui-batch
|
I see greptile woke up on the sassy side of the motherboard. |
Another test would be to get a lot of units on screen then select them all so their unit info is being drawn. The health bars make use of the line drawing classes which i believe are inherently affected by this change. |
|
Compile errors need to be fixed before this can be reviewed |
33ccf35 to
5dd0e6e
Compare
5dd0e6e to
667d24d
Compare
xezon
left a comment
There was a problem hiding this comment.
This is very nice. Can you measure the performance improvement for just the 2D rendering on its own? I expect it will be somewhere north of 95%.
xezon
left a comment
There was a problem hiding this comment.
How much gains does the 2d rendering have now on its own?
|
Interestingly the replays are failing |
|
The replays look like an CI issue. My profiler is currently crashing, so I can't give you a function-level breakdown of the 2D render path alone. Best I have for now is indications: |
There was a problem hiding this comment.
You submitted a mysterious file here. Perhaps this fails the Replay check.


Reduces draw calls by batching 2D elements by texture state in setup2DRenderState.
Added early-out clipping and refined bounds checks to skip rendering objects outside the active region.
Optimized HotKey rendering by removing a redundant Render() call.